Parallel Data Helps Neural Entity Coreference Resolution
Coreference resolution is the task of finding expressions that refer to the same entity in a text. Coreference models are generally trained on monolingual annotated data, but annotating coreference is expensive and challenging. Hardmeier et al. (2013) have shown that parallel data contains latent anaphoric knowledge, but it has not yet been exploited in end-to-end neural models. In this paper, we propose a simple yet effective model to exploit coreference knowledge from parallel data. In addition to the conventional modules that learn coreference from annotations, we introduce an unsupervised module to capture cross-lingual coreference knowledge. Our proposed cross-lingual model achieves consistent improvements, up to 1.74 percentage points, on the OntoNotes 5.0 English dataset using 9 different synthetic parallel datasets. These experimental results confirm that parallel data can provide additional coreference knowledge that is beneficial to coreference resolution tasks.
Comment: camera-ready version; to appear in the Findings of ACL 2022
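The abstract does not spell out the architecture, but the general idea of pairing a supervised coreference objective with an unsupervised cross-lingual signal from parallel data can be sketched as follows. This is a minimal sketch, assuming the unsupervised module is an agreement loss between mention-pair scores on the two sides of a parallel sentence; the module names, shapes, and loss form are hypothetical, not the paper's actual design.

```python
# Hypothetical sketch: supervised coreference loss on annotated mention
# pairs plus an unsupervised cross-lingual agreement loss on parallel
# data. An assumed formulation, not the paper's architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PairScorer(nn.Module):
    """Scores whether two mention (span) embeddings corefer."""
    def __init__(self, dim: int):
        super().__init__()
        self.ffnn = nn.Sequential(
            nn.Linear(3 * dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # Concatenate both spans and their elementwise product, a common
        # featurization for mention-pair scoring.
        return self.ffnn(torch.cat([a, b, a * b], dim=-1)).squeeze(-1)

def supervised_loss(scorer, pairs, labels):
    """Binary coreference loss on annotated mention pairs."""
    return F.binary_cross_entropy_with_logits(
        scorer(pairs[:, 0], pairs[:, 1]), labels)

def crosslingual_agreement_loss(scorer, src_pairs, tgt_pairs):
    """Unsupervised signal: mention pairs projected through word
    alignments should receive similar coreference scores in both
    languages (an assumed formulation)."""
    p_src = torch.sigmoid(scorer(src_pairs[:, 0], src_pairs[:, 1]))
    p_tgt = torch.sigmoid(scorer(tgt_pairs[:, 0], tgt_pairs[:, 1]))
    return F.mse_loss(p_src, p_tgt)

# Joint objective over annotated data and (unannotated) parallel data.
dim = 64
scorer = PairScorer(dim)
gold = torch.randn(8, 2, dim)                  # annotated mention pairs
labels = torch.randint(0, 2, (8,)).float()
src = torch.randn(8, 2, dim)                   # aligned pairs, source side
tgt = torch.randn(8, 2, dim)                   # aligned pairs, target side
loss = supervised_loss(scorer, gold, labels) \
     + 0.5 * crosslingual_agreement_loss(scorer, src, tgt)
loss.backward()
```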
A Document-Level SMT System with Integrated Pronoun Prediction
This paper describes one of Uppsala University’s submissions to the pronoun-focused machine translation (MT) shared task at DiscoMT 2015. The system is based on phrase-based statistical MT implemented with the document-level decoder Docent. It includes a neural network for pronoun prediction trained with latent anaphora resolution. At translation time, coreference information is obtained from the Stanford CoreNLP system.
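As a rough illustration of pronoun prediction with latent anaphora resolution, the sketch below treats the antecedent choice as a latent variable handled by soft attention over candidate antecedents, whose weighted representation then feeds the pronoun classifier. The dimensions, layers, and pronoun inventory are illustrative assumptions, not the submitted system.

```python
# Minimal sketch of a pronoun predictor with a latent antecedent choice
# (soft attention over candidates); hypothetical, not the DiscoMT system.
import torch
import torch.nn as nn

class PronounPredictor(nn.Module):
    def __init__(self, dim: int, n_pronouns: int):
        super().__init__()
        self.attn = nn.Linear(dim, dim)        # scores antecedent candidates
        self.out = nn.Linear(2 * dim, n_pronouns)

    def forward(self, context, candidates):
        # context:    (batch, dim)    source pronoun context embedding
        # candidates: (batch, k, dim) antecedent candidate embeddings
        scores = torch.einsum('bd,bkd->bk', self.attn(context), candidates)
        weights = torch.softmax(scores, dim=-1)   # latent antecedent choice
        antecedent = torch.einsum('bk,bkd->bd', weights, candidates)
        return self.out(torch.cat([context, antecedent], dim=-1))

model = PronounPredictor(dim=32, n_pronouns=5)
logits = model(torch.randn(4, 32), torch.randn(4, 6, 32))
print(logits.shape)  # torch.Size([4, 5])
```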
Exploring Predictive Uncertainty and Calibration in NLP: A Study on the Impact of Method & Data Scarcity
We investigate the problem of determining the predictive confidence (or, conversely, uncertainty) of a neural classifier through the lens of low-resource languages. By training models on sub-sampled datasets in three different languages, we assess the quality of estimates from a wide array of approaches and their dependence on the amount of available data. We find that while approaches based on pre-trained models and ensembles achieve the best results overall, the quality of uncertainty estimates can surprisingly suffer with more data. We also perform a qualitative analysis of uncertainties on sequences, discovering that a model's total uncertainty seems to be influenced to a large degree by its data uncertainty, not model uncertainty. All model implementations are open-sourced in a software package.
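The total/data/model distinction drawn here corresponds to the standard entropy-based decomposition of predictive uncertainty over an ensemble. A minimal sketch, assuming softmax outputs from several ensemble members (or MC-dropout samples):

```python
# Ensemble-based decomposition of predictive uncertainty into data
# (aleatoric) and model (epistemic) parts; input shapes are assumptions.
import numpy as np

def entropy(p, axis=-1):
    return -np.sum(p * np.log(p + 1e-12), axis=axis)

def decompose_uncertainty(probs):
    """probs: (n_members, n_examples, n_classes) softmax outputs."""
    mean_p = probs.mean(axis=0)
    total = entropy(mean_p)             # H[ E_theta p(y|x, theta) ]
    data = entropy(probs).mean(axis=0)  # E_theta H[ p(y|x, theta) ]
    model = total - data                # mutual information I(y; theta | x)
    return total, data, model

rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 10, 3))    # 5 members, 10 examples, 3 classes
probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
total, data, model = decompose_uncertainty(probs)
print(total.shape, data.mean(), model.mean())
```

If total uncertainty tracks the data term closely, as the abstract reports, the model component contributes comparatively little to the sum.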
On Statistical Machine Translation and Translation Theory
The translation process in statistical machine translation (SMT) is shaped by technical constraints and engineering considerations. SMT explicitly models translation as search for a target-language equivalent of the input text. This perspective on translation had wide currency in mid-20th century translation studies, but has since been superseded by approaches arguing for a more complex relation between source and target text. In this paper, we show how traditional assumptions of translational equivalence are embodied in SMT through the concepts of word alignment and domain and discuss some limitations arising from the word-level/corpus-level dichotomy inherent in these concepts.
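To make the word-alignment concept concrete, the following is a minimal IBM Model 1 EM sketch over a toy parallel corpus. It is illustrative only (no NULL word, no smoothing) and is not taken from the paper.

```python
# IBM Model 1: learn word translation probabilities t(e|f) by EM.
# Toy corpus for illustration; real SMT pipelines align far more data.
from collections import defaultdict

corpus = [("das haus".split(), "the house".split()),
          ("das buch".split(), "the book".split()),
          ("ein buch".split(), "a book".split())]

t = defaultdict(lambda: 1.0)  # uniform (unnormalized) initialization

for _ in range(10):  # EM iterations
    count = defaultdict(float)
    total = defaultdict(float)
    for f_sent, e_sent in corpus:
        for e in e_sent:
            # E-step: distribute each target word's mass over source words.
            z = sum(t[(e, f)] for f in f_sent)
            for f in f_sent:
                c = t[(e, f)] / z
                count[(e, f)] += c
                total[f] += c
    # M-step: renormalize expected counts into probabilities.
    for (e, f), c in count.items():
        t[(e, f)] = c / total[f]

print(round(t[("house", "haus")], 3))  # approaches 1.0
```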
Improving Machine Translation Quality Prediction with Syntactic Tree Kernels
We investigate the problem of predicting the quality of a given Machine Translation (MT) output segment as a binary classification task. In a study with four different data sets in two text genres and two language pairs, we show that the performance of a Support Vector Machine (SVM) classifier can be improved by extending the feature set with implicitly defined syntactic features in the form of tree kernels over syntactic parse trees. Moreover, we demonstrate that syntax tree kernels achieve surprisingly high performance levels even without additional features, which makes them suitable as a low-effort initial building block for an MT quality estimation system.
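As a minimal sketch of the implicitly defined syntactic features idea, the code below computes the Collins and Duffy subset-tree kernel over tuple-encoded parse trees and hands it to an SVM as a precomputed Gram matrix. The toy trees and labels are invented for illustration; the paper's data sets and setup are not reproduced here.

```python
# Subset-tree kernel (Collins & Duffy) + SVM with a precomputed Gram
# matrix, sketching tree-kernel-based MT quality estimation.
import numpy as np
from sklearn.svm import SVC

LAMBDA = 0.5  # decay factor penalizing large tree fragments

def nodes(t):
    """All internal nodes of a tuple-encoded tree (label, children...)."""
    if isinstance(t, str):
        return []
    out = [t]
    for child in t[1:]:
        out += nodes(child)
    return out

def production(t):
    return (t[0], tuple(c if isinstance(c, str) else c[0] for c in t[1:]))

def C(n1, n2):
    """Weighted count of common tree fragments rooted at n1 and n2."""
    if isinstance(n1, str) or isinstance(n2, str):
        return 0.0
    if production(n1) != production(n2):
        return 0.0
    if all(isinstance(c, str) for c in n1[1:]):  # pre-terminal node
        return LAMBDA
    prod = LAMBDA
    for c1, c2 in zip(n1[1:], n2[1:]):
        prod *= 1.0 + C(c1, c2)
    return prod

def tree_kernel(t1, t2):
    return sum(C(a, b) for a in nodes(t1) for b in nodes(t2))

# Toy parses standing in for "good" vs. "bad" MT output segments.
trees = [("S", ("NP", ("D", "the"), ("N", "cat")), ("VP", ("V", "sleeps"))),
         ("S", ("NP", ("D", "a"), ("N", "dog")), ("VP", ("V", "runs"))),
         ("S", ("VP", ("V", "sleeps")), ("NP", ("D", "the"), ("N", "cat"))),
         ("S", ("VP", ("V", "runs")), ("NP", ("D", "a"), ("N", "dog")))]
y = [1, 1, 0, 0]

gram = np.array([[tree_kernel(a, b) for b in trees] for a in trees])
clf = SVC(kernel="precomputed").fit(gram, y)
print(clf.predict(gram))  # separates the two toy structures
```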